Infer.NET

String inference

Infer.NET represents distributions on strings via probabilistic automata. Operations on automata are described in Belief Propagation with Strings, with more details in Notes on String operators. The Infer.NET implementation is described in String inference API design. See Hello, Strings! for an example of modelling with uncertain strings.

Viewing automata

You can create a GraphViz file from a SequenceDistribution by calling ToString on the distribution using the SequenceDistributionFormats.GraphViz format and writing to file.

You can use http://graphviz.it/ to view small automata by copying and pasting the string from the file to the window.

To view large automata, install GraphViz, and run dot.exe. The following example creates a jpeg from the file containing the graph description:

dot -Tjpeg myautomaton.txt > myautomaton.jpeg

Reading

Introduction to Automata Theory - a book covering basics
Weighted Finite-State Transducer Algorithms: An Overview - a survey of algorithms on WFSA covering intersection, determinization, weight pushing and minimization.
Finite-State Transducers in Language and Speech Processing - a more detailed survey
Weighted Automata Algorithms - one more survey, with a focus on algorithms

Automata libraries

ASTL - C++ automata library with STL-like design: algorithms are implemented in terms of automata iterators.
Jolt.NET - .NET DFA/NFA library
OpenFST - weighted finite-state transducer C++ library

Useful facts from automata theory

Transducer operations
- Intersection - regular relations aren’t closed under intersection, complement and subtraction [link]
  - What would happen if one applies the joint traversal as for automata intersection?
DFA (deterministic finite automaton) minimization
- Survey
- Basic idea: remove unreachable states, merge indistinguishable states [link, p.159]
- Brzozowski’s algorithm [link]
  - Invert all edges of the DFA to get an NDFA for the reversed language
  - Apply the power set construction to the NDFA to get a minimal DFA for the reversed language
  - Invert the edges of the minimal DFA back and apply the power set construction to get the minimal DFA for the original language
  - Applies to DFA as well as NFA
  - Is claimed to be efficient in practice
  - Can be justified from the algebraic point of view [link]
NFA (non-deterministic finite automaton) minimization
- Can determinize, then minimize
  - Can’t the minimal DFA have even more states than the original NFA?
- Can apply Brzozowski’s algorithm
- Minimization of NFA is PSPACE-complete (that is, very hard) [discussion link]
- What if one gives up on the optimality guarantees? [discussion link]
  - One can introduce a state equivalence relation stronger than the language equality relation and merge states based on it. Examples include equivalence based on simulation relations [link].
  - There exist some attempts to generalize this technique to the weighted case [link, link]
Weighted NFA determinization [link]
- Requires an extension of power set construction since it matters not only in which state you are, but also how much weight you carry
- Extended power set construction may not halt for some automata
  - Some weighted NFA don’t even have an equivalent weighted DFA
  - Weighted NFA is always determinizable if there are no loops
  - Having only self-loops doesn’t save you
- There is a criteria to determine if an automata is determinizable, so-called twins property, but it works only for tropical semirings
  - Apparently, there are works that generalize it to arbitrary semirings [link]
- Weighted NFA over a tropical semiring can be made determinizable by adding new transitions with special symbols [link]
- There exist algorithms for approximate determinization [link]
- Disambiguation is an alternative. It applies to a wider class of automata and can lead to exponentially smaller results [link]. There are algorithms for testing it [link]
Weighted DFA minimization [link] ○ Push weights (corresponds to global probability normalization in our case) ○ Minimize the automata using the regular DFA minimization algorithm, considering (weight, label) as a single edge label
Computing mode
- Is easy for Viterbi semiring, but NP-hard for the probabilistic semiring [link]
- Should be easy for unambiguous automata (see determinization)
- There are some works on the algorithms for the general case
Learning probabilistic automata from samples
- A WFA-based distribution over sequences is identifiable in the limit from samples [link]
- Learning can be seen as a matrix completion problem [link]
- There was a competition, lots of datasets and results are available [link]
Distances between automata
- KL-divergence between unambiguous (no two successful paths are labelled with the same string) probabilistic WFSA can be efficiently computed [link]
- L2 distance between arbitrary weighted automata can be computed in polynomial time [link]